1. I was the prosecuting attorney who filed the above-identified application.

2. On September 9, 2003, my firm received a copy of the patent application 
containing the subject matter of the present claims. Shown in Exhibit A along with the 
accompanying email are excerpts from numbered paragraphs [0054] - [0104] of this copy 
describing the basic subject matter defined in the claims, especially the subject matter 
associated with providing updated centering coefficients. The excerpts are nearly identical 
(except for format) to those corresponding sections of the filed application. 

3. On September 10, 2003, my firm received in our office a final revised patent 
application from the Applicant, including authorization to review and proceed with filing. 

4. On September 11, 2003, work proceeded here in our office to prepare the requisite
filing papers. Attached in Exhibit C are metadata analysis files showing creation of the filing 
receipt, priority document, and fee calculation documents on September 11, 2003 for the 
showing a creation date of September 11, 2003 for the docket number of the present
application. 
Ron,



Attached is an amended draft of the above identified case. I have adopted all of Ed's recommended changes and 
added a few others (highlighted in blue). Please review and prepare for filing. The current draft addresses all of 
the inventor's comments. 

Regards 

Eric 
Eric Strang, Ph.D.

Principal Engineer, 
Intellectual Property 
Tokyo Electron America 
2120 West Guadalupe Road 
Gilbert, AZ 85233
Direct: (480) 507-4836
Mobile: (602) 647-2504
Fax: (480) 507-3115

This message, together with any attachments, is intended only for the use of the individual or entity to 
which it is addressed and may contain information that is legally privileged, confidential and exempt 
from disclosure. If you are not the intended recipient, you are hereby notified that any dissemination, 
distribution, or copying of this message, or any attachment, is strictly prohibited. If you have received 
this message in error, please notify the original sender or the Tokyo Electron America, Inc. IP Group
(480-507-4836) immediately by telephone or by return E-mail and delete this message, along with any 
attachments, from your computer. Thank you. 



2/12/2007 



window (not shown) to plasma processing region 45. A frequency for the 
application of RF power to the inductive coil 80 preferably ranges from 10 
MHz to 100 MHz and is preferably 13.56 MHz. Similarly, a frequency for the 
application of power to the chuck electrode preferably ranges from 0.1 MHz to 
30 MHz and is preferably 13.56 MHz. In addition, a slotted Faraday shield 
(not shown) can be employed to reduce capacitive coupling between the 
inductive coil 80 and plasma. Moreover, controller 55 can be coupled to RF 
generator 82 and impedance match network 84 in order to control the 
application of power to inductive coil 80. In an alternate embodiment, 
inductive coil 80 can be a "spiral" coil or "pancake" coil in communication with 
the plasma processing region 45 from above as in a transformer coupled 
plasma (TCP) reactor. 

[0051] Alternately, the plasma can be formed using electron cyclotron 
resonance (ECR). In yet another embodiment, the plasma is formed from the 
launching of a Helicon wave. In yet another embodiment, the plasma is 
formed from a propagating surface wave. 

[0052] As discussed above, the process performance monitoring system 100 
includes a plurality of sensors 50 and controller 55, where the sensors 50 are
coupled to process tool 10 and the controller 55 is coupled to the sensors 50 
to receive tool data. The controller 55 is further capable of executing at least 
one algorithm to optimize the tool data received from the sensors 50, 
determine a relationship (model) between the tool data, and use the 
relationship (model) for fault detection. 

[0053] When encountering large sets of data involving a substantive number 
of variables, multivariate analysis (MVA) is often applied. For example, one 
such MVA technique includes Principal Components Analysis (PCA). In PCA, 
a model can be assembled to extract from a large set of data, a signal 
exhibiting the greatest variance in the multi-dimensional parameter space. 
[0054] For example, each set of data parameters for a given substrate run,
or instant in time, can be stored as a row in a matrix X and, hence, once the
matrix X is assembled, each row represents a different substrate run, or
instant in time (or observation), and each column represents a different data
parameter (or data variable) corresponding to the plurality of sensors 50.
Therefore, matrix X is a rectangular matrix of dimensions q by r, where q
represents the row dimension and r represents the column dimension. Once
the data is stored in the matrix, the data is generally mean-centered and/or
normalized. The process of mean-centering the data stored in a matrix
column involves computing a mean value of the column elements and
subtracting the mean value from each element. Moreover, the data residing
in a column of the matrix can be normalized by determining the standard
deviation of the data in the column and dividing each element by it.
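
The centering and normalization described in this paragraph can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the MATLAB routines named elsewhere in this document. The function name `center_and_scale`, the use of the sample standard deviation, and the sample values are assumptions made for the example.

```python
from statistics import mean, stdev


def center_and_scale(X):
    """Mean-center and normalize each column of X.

    X is a list of rows (observations); each column is one data parameter.
    Assumes at least two rows and no constant columns (nonzero deviation).
    """
    columns = list(zip(*X))
    means = [mean(c) for c in columns]
    stds = [stdev(c) for c in columns]  # sample standard deviation (assumption)
    scaled = [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in X
    ]
    return scaled, means, stds


# Hypothetical data: three observations (rows) of two data parameters (columns).
X_raw = [[10.0, 200.0], [12.0, 220.0], [14.0, 240.0]]
X_scaled, col_means, col_stds = center_and_scale(X_raw)
```

After scaling, both columns carry the same unit-free values even though the raw parameters differ by an order of magnitude, which is the purpose of the normalization step.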

[0055] Using the PCA technique, the correlation structure within matrix X is
determined by approximating matrix X with a matrix product ($TP^{T}$) of lower
dimensions plus an error matrix E, viz.

[0056] $X = TP^{T} + E$, (1a)

[0057] where

[0058] $X_{i,j} = \dfrac{\bar{X}_{i,j} - \bar{X}_{M,j}}{\sigma_{X,j}}$, (1b)

[0059] "i" represents the i-th row, "j" represents the j-th column, subscript "M"
represents mean value, $\sigma$ represents standard deviation, $\bar{X}$ is the raw
data, T is a (q by p) matrix of scores that summarizes the X-variables, and
P is a (r by p, where p<r) matrix of loadings showing the influence of the
variables.

[0060] In general, the loadings matrix P can be shown to comprise the 
eigenvectors of the covariance matrix of X , where the covariance matrix S 
can be shown to be 

[0061] $S = X^{T}X$. (2)

[0062] The covariance matrix S is a real, symmetric matrix and, therefore, it 
can be described as 

[0063] $S = U \Lambda U^{T}$, (3)

[0064] where the real, symmetric eigenvector matrix U comprises the
normalized eigenvectors as columns and $\Lambda$ is a diagonal matrix comprising
the eigenvalues corresponding to each eigenvector along the diagonal. Using
equations (1a) and (3) (for a full matrix of p=r; i.e. no error matrix), one can
show that

[0065] $P = U$ (4)

[0066] and

[0067] $T^{T}T = \Lambda$. (5)

[0068] A consequence of the above eigen-analysis is that each eigenvalue 
represents the variance of the data in the direction of the corresponding 
eigenvector within n-dimensional space. Hence, the largest eigenvalue 
corresponds to the greatest variance in the data within the multi-dimensional 
space whereas the smallest eigenvalue represents the smallest variance in 
the data. By definition, all eigenvectors are orthogonal, and therefore, the 
second largest eigenvalue corresponds to the second greatest variance in the 
data in the direction of the corresponding eigenvector, which is, of course, 
normal to the direction of the first eigenvector. In general, for such analysis, 
the first several (three to four, or more) largest eigenvalues are chosen to 
approximate the data and, as a result of the approximation, an error E is 
introduced to the representation in equation (1a). In summary, once the set of 
eigenvalues and their corresponding eigenvectors are determined, a set of the 
largest eigenvalues can be chosen and the error matrix E of equation (1a) 
can be determined. 
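
As a minimal sketch of the eigen-analysis described above, the following Python example forms S from a small, already centered and scaled matrix and estimates the largest eigenvalue and its eigenvector. Power iteration is substituted here purely for illustration; it is not the MATLAB routine named in this document. The function names and the sample data are assumptions.

```python
import math


def scatter_matrix(X):
    """Return S = X^T X for a centered, scaled data matrix X (list of rows)."""
    r = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(r)]
            for i in range(r)]


def leading_component(S, iterations=200):
    """Estimate the largest eigenvalue of S and its unit eigenvector.

    Uses power iteration. The start vector must not be orthogonal to the
    leading eigenvector; that is assumed here, not checked.
    """
    r = len(S)
    v = [1.0] + [0.0] * (r - 1)
    for _ in range(iterations):
        w = [sum(S[i][j] * v[j] for j in range(r)) for i in range(r)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    # Rayleigh quotient v^T S v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(S[i][j] * v[j] for j in range(r))
                     for i in range(r))
    return eigenvalue, v


# Hypothetical scaled data in which both parameters vary together.
X = [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
S = scatter_matrix(X)
eigenvalue, loading = leading_component(S)
```

In this example all of the variance lies along the direction in which the two parameters move together, so a single component reproduces the data and the error matrix E is zero.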

[0069] An example of commercially available software which supports PCA 
modeling is MATLAB™ (commercially available from The Mathworks, Inc., 
Natick, MA), and PLS Toolbox (commercially available from Eigenvector 
Research, Inc., Manson, WA). 

[0070] Additionally, once a PCA model is established, commercially 
available software, such as MATLAB™, is further capable of producing as 
output other statistical quantities such as the Hotelling $T^{2}$ parameter for an
observation, or the Q-statistic. The Q-statistic for an observation can be 
calculated as follows 

[0071] $Q = E^{T}E$, (6a)

[0072] where

[0073] $E = X(I - PP^{T})$, (6b)

[0074] and I is the identity matrix of appropriate size. For example, a PCA
model (loadings matrix P, etc.) can be constructed using a "training" set of
data (i.e. assemble X for a number of observations and determine a PCA
model using MATLAB™). Once the PCA model is constructed, projections of
a new observation onto the PCA model can be utilized to determine a residual
matrix E, as in equation (1).
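
The projection of a new observation onto an existing model, per equations (6a) and (6b), can be sketched as follows. The example assumes the loadings matrix P has orthonormal columns and evaluates Q for a single observation as the sum of the squared residual elements. The function name `q_statistic` and the sample loadings are hypothetical.

```python
import math


def q_statistic(x, P):
    """Q for one scaled observation x (length r) and loadings P (r by p).

    Assumes the columns of P are orthonormal, so that x P P^T is the part
    of x captured by the model and the remainder is the residual.
    """
    r, p = len(P), len(P[0])
    scores = [sum(x[j] * P[j][a] for j in range(r)) for a in range(p)]
    modeled = [sum(scores[a] * P[j][a] for a in range(p)) for j in range(r)]
    residual = [x[j] - modeled[j] for j in range(r)]
    return sum(e * e for e in residual)


# Hypothetical one-component model: both parameters load equally.
P = [[1.0 / math.sqrt(2.0)], [1.0 / math.sqrt(2.0)]]

q_on_model = q_statistic([1.0, 1.0], P)    # lies along the model direction
q_off_model = q_statistic([1.0, -1.0], P)  # lies entirely outside the model
```

An observation along the model direction leaves essentially no residual, while one orthogonal to it is all residual, which is why Q responds to breaks in the correlation structure.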

[0075] Similarly, the Hotelling $T^{2}$ can be calculated as follows

[0076] $T_{i}^{2} = \sum_{a=1}^{p} \dfrac{T_{ia}^{2}}{s_{ta}^{2}}$, (7a)

[0077] where

[0078] $T = XP$, (7b)

[0079] and $T_{ia}$ is the score (from equation (7b)) for the i-th observation
(substrate run, instant in time, etc.; i.e., i=1 to q) and the a-th model dimension
(i.e., a=1 to p), and $s_{ta}^{2}$ is the variance of $T_{a}$. For example, a PCA model
(loadings matrix P, etc.) can be constructed using a "training" set of data (i.e.
assemble X for a number of observations and determine a PCA model using
MATLAB™). Once the PCA model is constructed, projections of a new
observation onto the PCA model can be utilized to determine a new scores
matrix T.
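
A minimal sketch of the calculation in equations (7a) and (7b) for a single new observation follows. It assumes the usual form of the statistic, namely the sum over the model dimensions of each squared score divided by the variance of that score in the training set. The function name `hotelling_t2` and all numerical values are hypothetical.

```python
def hotelling_t2(x, P, score_variances):
    """Hotelling T^2 for one scaled observation x.

    P is the r-by-p loadings matrix and score_variances holds the variance
    of each score column obtained from the training data (length p).
    """
    r, p = len(P), len(P[0])
    scores = [sum(x[j] * P[j][a] for j in range(r)) for a in range(p)]  # t = x P
    return sum(t * t / s2 for t, s2 in zip(scores, score_variances))


# Hypothetical two-component model over three data parameters.
P = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]
training_variances = [4.0, 9.0]

t2 = hotelling_t2([2.0, 3.0, 5.0], P, training_variances)
```

Here each score sits exactly one standard deviation from the model center, so each dimension contributes 1 to the total. The third parameter does not enter the statistic because the model has no loading on it.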

[0080] Typically, a statistical quantity, such as the Q-statistic, or the Hotelling
$T^{2}$, is monitored for a process, and, when this quantity exceeds a
predetermined control limit, a fault for the process is detected.
[0081] FIG. 6A shows an example of conventional use of a PCA model to 
monitor the Q-statistic (Q-factor) of a process in order to determine faults in 
the process. In the example of FIG. 6A, the model is applied to process data 
acquired from Unity II DRM (Dipole Ring Magnet) CCP (Capacitively Coupled 
Plasma) processing systems (commercially available from Tokyo Electron 
Limited; see FIG. 3) that perform a patterned oxide etch with a C4F8/CO/Ar +
O2 based chemistry. This processing system operates in a batch mode with a
fixed process recipe for each lot. Typically, a single recipe is utilized from lot 
to lot for a particular process step in the manufacture of a device. The same 
processing system is frequently utilized for many different device layers and 
steps, but for each process step, the recipe remains the same. 
[0082] The data parameters collected include the chamber pressure, applied 
power, various temperatures, and many other variables relating to the 
pressure, power, and temperature control as shown in Table 1.
[0083] The process recipe used in this example has three main steps: a 
photoresist cleaning step, a main etch step, and a photoresist stripping step. 
This example was applied to the main etch step, but the approach is not limited to
this particular step or any particular step and is, therefore, applicable to other
steps as well.

[0084] For each process step, an observation mean and observation 
standard deviation of a time trace for each data parameter (or tool variable) 
was calculated from roughly 160 samples for each substrate. The beginning 
portion of the time trace for each data parameter, where the RF power 
increases, was trimmed in these statistical calculations in an attempt to 
remove the variation due to the power when it is turned on. 
[0085] In the example of FIG. 6A, a PCA model was constructed from the first
500 substrates using the same recipe in a single processing system. The 
standard PCA methods implemented in MATLAB™ were used, with mean 
centering and unit variance scaling. Also, the standard Q residuals (SPE) and 
Q contributions were calculated using the Eigenvector Research PLS Toolbox 
offered by Eigenvector Research as an add-on to MATLAB™. 
[0086] In the example of FIG. 6A, the PCA model was constructed from the 
first 500 substrates in a first processing system and was applied to all 3200 
substrates from this processing system. As seen in this figure, the resulting Q 
statistic exceeds the 95% confidence limit in the model within less than 250 
substrates after the PCA model was built (i.e. by substrate number 750), and 
never returns to below that level. In addition, distinct outliers and distinct 
step-like changes are apparent. Thus, FIG. 6A demonstrates that while a 
conventional PCA model constructed as described above can be used to 
monitor the Q-statistic, there exist periods of time where the statistical 
parameter deviates above the control limit never to return below. Indeed, any
of the above described statistics (e.g., the Q-statistic, or the Hotelling $T^{2}$
parameter) can be monitored using a given model for a specific process in a 
specific processing system, but will eventually deviate above the control limit 
never to return below. Thereafter, the model is no longer applicable to the 
given process and given processing system. 

[0087] While methods are known for preserving the usefulness of the PCA 
model over long process runs, the present inventors have recognized that 
these methods are not practical for commercial application to semiconductor 
manufacturing process control. For example, using an adaptive model 
technique, the PCA model can be actually rebuilt with each process run in
order to update the model on the fly during the process. While this adaptive
modeling technique may generally stabilize the statistical monitoring within a
given control limit, it requires computational resources not practical for
commercial processes. 

[0088] Another technique for maintaining the usefulness of the statistical 
monitoring of FIG. 6A is to employ a more complicated control limit scheme. 
Specifically, the control limit can be reset for each process run based on a 
predicted degradation of the PCA model. While this method will avoid the 
indication of an out-of-process condition due to degradation of the PCA 
model, changing the control limit with each process run requires a complex 
scheme that is also impractical for commercial processes. 
[0089] Thus, the present inventors have recognized that conventional 
methods for adapting a PCA model to enable statistical monitoring over long 
process runs are impractical for commercial processes. More specifically, the
present inventors have discovered that the standard approach to centering 
and scaling the data in a PCA matrix has not enabled the development of a 
robust model capable of use for long periods of time (i.e., substantive number 
of substrate runs). 

[0090] In an embodiment of the present invention, an adaptive multivariate 
analysis is described for preparing a robust PCA model. Therein, the 
centering and scaling coefficients are updated using an adaptation scheme. 
The mean values (utilized for centering) for each summary statistic are 
updated from one observation to the next using a filter, such as an
exponentially weighted moving average (EWMA) filter shown as follows:

[0091] $\bar{X}_{M,j,n} = \lambda \bar{X}_{M,j,n-1} + (1-\lambda)\bar{X}_{j,n}$, (8)

[0092] where $\bar{X}_{M,j,n}$ represents the calculated model mean value ("M") of
the j-th data parameter at the current run (or observation "n"), $\bar{X}_{M,j,n-1}$
represents the calculated model mean value ("M") of the j-th data parameter at
the previous run (or observation "n-1"), $\bar{X}_{j,n}$ represents the current value of
the j-th data parameter for the current run, and $\lambda$ is a weighting factor ranging
from a value of 0 to 1. For example, when $\lambda$=1, the model mean value utilized
for centering each data parameter is the previously used value, and, when
$\lambda$=0, the model mean value utilized for centering each data parameter is the
current measured value.
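
The EWMA update of equation (8) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used to produce the figures: the function name `ewma_update` and the sample values are hypothetical.

```python
def ewma_update(previous_means, current_values, weight):
    """Update the model mean of every data parameter for one new run.

    weight is the weighting factor of equation (8): 1 keeps the previous
    mean unchanged, 0 replaces it with the current measured value.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return [weight * m + (1.0 - weight) * x
            for m, x in zip(previous_means, current_values)]


# Hypothetical model means for two data parameters and one new run.
model_means = [10.0, 200.0]
new_run = [20.0, 100.0]

kept = ewma_update(model_means, new_run, 1.0)      # previous means retained
replaced = ewma_update(model_means, new_run, 0.0)  # current values adopted
blended = ewma_update(model_means, new_run, 0.5)   # halfway between the two
```

A weighting factor close to 1 lets the centering coefficients follow slow drift in the tool while keeping them nearly unaffected by a single outlying run.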

[0093] The model standard deviations (utilized for scaling) for each summary 
statistic are updated using the following recursive standard deviation filter 

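[0094] [Equation (9), the recursive filter giving the current model standard deviation from the previous-run value, the run number n, and the filter constant k, is not legible in this copy.] (9)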



[0095] where $\sigma_{X,j,n}$ represents the calculated model standard deviation of
the j-th data parameter for the current run (or observation "n"), $\sigma_{X,j,n-1}$
represents the calculated model standard deviation of the j-th data parameter
for the previous run (or observation "n-1"), n represents the run (or
observation) number, and k represents a filter constant. The filter constant k 
can, for example, be selected as a constant less than or equal to N, where N 
represents the number of substrate runs, or observations, utilized to construct 
the PCA model. 

TABLE 1.

Area                     Variable          Description
Gas Flow and Pressure    PRESSURE          Chamber Pressure
                         APC               Throttle Valve Angle
                         Ar                Ar Flow Rate
                         C4F8              C4F8 Flow Rate
                         CO                CO Flow Rate
Power and Matching       RF-FORWARD-LO     Lower Electrode Power
                         C1-POSITION-LO    Matching Network Capacitor 1
                         C2-POSITION-LO    Matching Network Capacitor 2
                         MAGNITUDE         Matcher Magnitude
                         PHASE             Matcher Phase
                         RF-VDC-LO         Lower Electrode DC Voltage
                         RF-VPP-LO         Lower Electrode Peak to Peak Voltage
ES Chuck                 ESC-CURRENT       Electrostatic Chuck Current
                         ESC-VOLTAGE       Electrostatic Chuck Voltage
Temperature and Cooling  LOWER-TEMP        Lower Electrode Temperature
                         UPPER-TEMP        Upper Electrode Temperature
                         WALL-TEMP         Wall Temperature
                         COOL-GAS-FLOW1    He Edge Cooling Flow Rate
                         COOL-GAS-FLOW2    He Center Flow Rate
                         COOL-GAS-P1       He Edge Cooling Gas Pressure
                         COOL-GAS-P2       He Center Cooling Gas Pressure

[0096] FIG. 6B shows the same example of using a PCA model to monitor
the Q-statistic that was presented in FIG. 6A, except that the centering and
scaling coefficients are updated using an adaptation scheme in accordance
with the present invention. As seen in this figure, after the first 500 wafers,
when the centering and scaling coefficients are adapted as described above
($\lambda$=0.92; k=500), the Q-statistic chart is substantially more stable across
all of the remaining substrates, and the data predominantly resides within the
same limit. The inventive adaptation scheme provides similar improvement to
other statistical monitoring schemes (e.g., the Hotelling $T^{2}$ parameter). Thus,
adaptation of the PCA model in accordance with the present invention allows
for a more robust PCA model that can be used for long process runs.
[0097] Referring now to FIGs. 6A and 6B together, the first excursion of 
substantive magnitude is the run with the largest Q value in the adaptive case, 
which occurs for substrate 1492. In the residual contribution plots for both the
static and adaptive cases (see FIG. 7), C1-POSITION-LO mean, RF-VPP-LO 
mean, and ESC-CURRENT are the extreme values. The arbitrarily scaled 
summary statistics for the latter two data parameters are plotted in FIG. 8. 
These three data parameters account for the large spikes in the data at four 
points, which could indicate an issue with the impedance match network 
system. This type of outlier is clear in both Q charts, but only the adaptive 
case allows for a fixed limit (e.g., 95% confidence limit) for all time. 
[0098] In another embodiment, the relative change in the centering and 
scaling coefficients can be calculated to alert the operator or engineer that 
step summary statistics have shifted between two runs, or observations. For 
each centering coefficient, this is done by subtracting the estimate at an initial 
run from the estimate at a final run, then scaling each difference by the 
standard deviation used for scaling that step statistic for the initial run, viz. 

[0099] $M_{\bar{X}_{j}} = \dfrac{\bar{X}_{M,j,b} - \bar{X}_{M,j,a}}{\sigma_{X,j,a}}$, (10)

[00100] where $M_{\bar{X}_{j}}$ is the model mean movement metric, $\bar{X}_{M,j,a}$ represents
the model mean value for the j-th data parameter for the a-th substrate, $\bar{X}_{M,j,b}$
represents the model mean value for the j-th data parameter for the b-th
substrate, and $\sigma_{X,j,a}$ represents the model standard deviation for the j-th data
parameter for the a-th substrate.

[00101] For the scaling coefficient, the calculation is the difference in standard 
deviations scaled with the mean used for centering that step statistic, viz. 



[00102] $M_{\sigma_{j}} = \dfrac{\sigma_{X,j,b} - \sigma_{X,j,a}}{\bar{X}_{M,j,a}}$, (11)

[00103] where $\sigma_{X,j,b}$ represents the model standard deviation for the j-th data
parameter for the b-th substrate.

[00104] These results are then displayed in a Pareto chart to identify the 
variables that exhibited the largest relative change during the period. For 
example, this supplement to the typical contribution plot can give the operator 
insight on the global changes in the set of data parameters. In contrast, the
contribution plot indicates the local deviation in a particular run. 
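
The two movement metrics and the ordering used for a Pareto chart can be sketched as follows. The sketch follows the verbal description in this embodiment: the change in the model mean is scaled by the standard deviation at the initial run, and the change in the model standard deviation is scaled by the mean at the initial run. The function names and every numerical value are hypothetical; only the variable names are taken from Table 1.

```python
def mean_movement(mean_initial, mean_final, std_initial):
    """Shift of the centering coefficient in units of the initial deviation."""
    return (mean_final - mean_initial) / std_initial


def std_movement(std_initial, std_final, mean_initial):
    """Change of the scaling coefficient relative to the initial mean."""
    return (std_final - std_initial) / mean_initial


def pareto_order(metrics):
    """Sort (name, metric) pairs by descending magnitude for a Pareto chart."""
    return sorted(metrics.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical (initial mean, final mean, initial std) per data parameter.
coefficients = {
    "PRESSURE": (40.0, 40.1, 1.0),
    "C1-POSITION-LO": (50.0, 56.0, 2.0),
    "WALL-TEMP": (60.0, 61.0, 0.5),
}
movement = {name: mean_movement(a, b, s)
            for name, (a, b, s) in coefficients.items()}
ranking = pareto_order(movement)
```

Sorting by magnitude rather than signed value places a large downward shift ahead of a small upward one, which matches the purpose of highlighting the variables that changed most during the period.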
[00105] Referring again to FIGs. 6A and 6B, the next type of excursion is 
observed at steps in the input summary data. In the static case, these 
excursions are clearly evident in the Q chart, although automating detection of 
these changes proves to be quite difficult. In the adaptive case, there are only 
4 periods where the Q statistic violates the confidence limit for more than 5
consecutive substrates (starting at substrates 1880, 2535, 2683, and 2948). 
When the model mean movement metric is calculated about each of these 
four periods (from the substrate before the period to the substrate after the 
period), the most extreme values occur for 1880 and 2946 on C1-POSITION- 
LO mean and WALL-TEMP mean, respectively. FIG. 9A presents the model 
mean movement metric and the model standard deviation metric for all of the 
data parameters. The arbitrarily scaled summary data for the two data 
parameters is displayed in FIG. 9B. The two major changes in the Q statistic 
seem to be dominated by these two data parameters. For example, the shift 
in these data parameters may have been caused by a tool cleaning, e.g., 
replacing key parts and changing the electrical or heat transfer characteristics 
of the processing system. Although the temperature is regulated in the 
processing system, this is done only at the upper electrode and walls. The 
lower temperature is not controlled and could be affected by different 
materials or part configurations in the processing system. The contribution 
plots for the static case and the adaptive case for substrate 1880 both are 
dominated by the C1-POSITION-LO. For substrate 2948, WALL-TEMP is the 
dominant contribution in the adaptive case, but in the static case it is only 
slightly larger than the C1-POSITION-LO value (which does not change at this 
run). 

[00106] In addition to providing a more robust PCA model that can be used 
for statistical monitoring over long process runs, the adaptive technique also 
provides use of the same PCA model among different processing systems. 
FIGs. 10 and 11 illustrate a second example of the present invention wherein,
after looking at the major changes over time for one processing system, the
same model from the first 500 substrates was then applied to a set of 800 
substrates from a second processing system. As seen in FIG. 10, the plot of 
From: TEA Strang, Eric [estrang@phx.telusa.com]
Sent: Wednesday, September 10, 2003 6:51 PM 

To: Ron Rudder; Ellen Currier 

Cc: pcalabrese@phx.telusa.com 
Subject: RE: ES-004 

Categories: Folder: GroupWise Archive\INBOX
Ron, 

Attached is an amended draft for the above identified case. I have accepted your recommended changes, and 
made a few additional changes (highlighted in blue). Also, regarding the abstract, I will defer to your 
recommended approach (please proceed). Otherwise, please review and prepare for filing. 

Regards 

Eric 

«ES-004-Application-ES-09102003.doc.pgp» 

Eric Strang, Ph.D. 

Principal Engineer, 
Intellectual Property 
Tokyo Electron America 
2120 West Guadalupe Road
Gilbert, AZ 85233
Direct: (480) 507-4836
Mobile: (602) 647-2504
Fax: (480) 507-3115
Ronald Rudder



From: Thuy B. Luu 

Sent: Tuesday, February 13, 2007 6:47 AM
To: Ronald Rudder 
Subject: 242662US 
Ron,

Here is the entry that was performed by you under docket 242662US 

9/12/03 - Completed filing package and final review of the application 
$122.50 
Thuy B. Luu



Billing Supervisor 
Oblon, Spivak, McClelland, Maier & Neustadt, P.C.
703-412-6449 